Probability and Statistics: The Science of Uncertainty: Quantifying Uncertainty: The Random Variable Function

In this session, we move from describing outcomes qualitatively to a rigorous quantitative framework. We define a random variable not as a "variable" in the algebraic sense, but as a deterministic function that maps each element of a sample space to a point on the real number line. The randomness lies in which outcome occurs, not in the mapping itself.

The Functional Definition (Definition 2.1.1)

A random variable $X$ is a function $X: S \to R^1$ that assigns a real number $X(s)$ to every possible outcome $s$ in the sample space $S$. Refer to Figure 2.1.1 for the visual mapping of this process.
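To make the functional view concrete, here is a minimal sketch in Python. The sample space (two tosses of a coin) and the choice of $X$ as "number of heads" are illustrative assumptions, not taken from the text.

```python
# Hypothetical sample space: the four outcomes of tossing a coin twice.
S = ["HH", "HT", "TH", "TT"]

def X(s: str) -> int:
    """A random variable: the number of heads in outcome s."""
    return s.count("H")

# The mapping is deterministic: every outcome s gets exactly one real number X(s).
values = {s: X(s) for s in S}
print(values)
```

Nothing in `X` is random. Calling it twice on the same outcome always returns the same number.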

The Indicator Function ($I_A$)

To bridge set theory and arithmetic, we define the indicator function of an event $A$:

$$I_A(s) = \begin{cases} 1 & s \in A \\ 0 & s \notin A \end{cases}$$

This transforms the occurrence of an event into a binary numerical signal.
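As a rough illustration of that bridge, the sketch below builds indicator functions for two events on a die roll and checks two standard identities: $I_{A \cap B} = I_A \cdot I_B$ and $I_{A^c} = 1 - I_A$. The events `A` and `B` and the helper name `indicator` are assumptions made for the example.

```python
def indicator(A):
    """Return the indicator function I_A of the event A (a set of outcomes)."""
    A = frozenset(A)
    def I_A(s):
        return 1 if s in A else 0
    return I_A

# Hypothetical setting: one roll of a six-sided die.
S = set(range(1, 7))
A = {2, 4, 6}   # "the roll is even"
B = {4, 5, 6}   # "the roll is at least 4"

I_A, I_B = indicator(A), indicator(B)

# Set intersection becomes multiplication of indicators.
product_matches = all(I_A(s) * I_B(s) == indicator(A & B)(s) for s in S)

# Set complement becomes subtraction from 1.
complement_matches = all(1 - I_A(s) == indicator(S - A)(s) for s in S)

print(product_matches, complement_matches)
```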

Defining Distributions (Definition 2.2.1)

The "distribution" of $X$ is the collection of probabilities $P(X \in B)$ for subsets $B \subseteq R^1$, where $P(X \in B)$ is shorthand for $P(\{s \in S : X(s) \in B\})$. Strictly speaking, $B$ must be a Borel subset, which is a technical restriction from measure theory. In practice, every subset we can write down explicitly is a Borel subset, so the restriction rarely matters.
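For a finite sample space, $P(X \in B)$ can be computed directly by summing the probabilities of the outcomes that $X$ sends into $B$. The sketch below assumes a fair die and a made-up random variable $X(s) = s \bmod 3$. The set $B$ is passed as a predicate so that intervals such as $(-\infty, 1]$ are easy to express.

```python
from fractions import Fraction

# Hypothetical setting: a fair six-sided die with uniform probabilities.
S = [1, 2, 3, 4, 5, 6]
P = {s: Fraction(1, 6) for s in S}

def X(s: int) -> int:
    """Illustrative random variable: remainder of the roll modulo 3."""
    return s % 3

def prob_X_in(B) -> Fraction:
    """P(X in B) = total probability of outcomes s whose value X(s) lies in B.

    B is given as a predicate on real numbers (x -> bool).
    """
    return sum((P[s] for s in S if B(X(s))), Fraction(0))

p_zero = prob_X_in(lambda x: x == 0)      # B = {0}
p_at_most_one = prob_X_in(lambda x: x <= 1)  # B = (-inf, 1]
p_anything = prob_X_in(lambda x: True)    # B = R^1
print(p_zero, p_at_most_one, p_anything)
```

Exact fractions are used instead of floats so that the probabilities can be compared without rounding error.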

Limits and Continuity of Probability

To reason about infinite sequences of events, we rely on two results connected to Theorems 1.3.4 and 1.6.1:

Countable Additivity (1.7.1): $P(A_1 \cup A_2 \cup \cdots) = \sum_{n=1}^{\infty} P(B_n)$, where the $B_n$ are disjoint versions of the $A_n$: $B_1 = A_1$ and $B_n = A_n \cap (A_1 \cup \cdots \cup A_{n-1})^c$ for $n \geq 2$.
Continuity of Probability (1.7.2): If a sequence of events $\{A_n\} \nearrow A$, meaning $A_1 \subseteq A_2 \subseteq \cdots$ and $\bigcup_n A_n = A$, then $\lim_{n \to \infty} P(A_n) = P(A)$.
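Continuity of probability can be watched numerically. The sketch below assumes a hypothetical probability measure on $S = \{1, 2, 3, \dots\}$ with $P(\{s\}) = 2^{-s}$ and the increasing events $A_n = \{1, \dots, n\}$, whose union is all of $S$. It is an illustration of the statement, not a proof.

```python
from fractions import Fraction

def p_point(s: int) -> Fraction:
    """Assumed point probabilities on S = {1, 2, 3, ...}: P({s}) = 2^(-s)."""
    return Fraction(1, 2**s)

def p_A(n: int) -> Fraction:
    """P(A_n) for the increasing events A_n = {1, ..., n}."""
    return sum((p_point(s) for s in range(1, n + 1)), Fraction(0))

# A_n increases to S, so P(A_n) should approach P(S) = 1.
gaps = [1 - p_A(n) for n in (1, 5, 10, 20)]
for n, gap in zip((1, 5, 10, 20), gaps):
    print(n, p_A(n), float(gap))
```

Under these assumptions $P(A_n) = 1 - 2^{-n}$, so the gap to $P(S) = 1$ halves with every step.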

Proof of Theorem 1.3.4

We want to prove that for any sequence of events $A_1, A_2, \dots$ (not necessarily disjoint):

$$P(A_1 \cup A_2 \cup \cdots) \leq P(A_1) + P(A_2) + \cdots$$

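The argument is short: each disjoint piece $B_n$ in (1.7.1) is a subset of $A_n$, so monotonicity gives $P(B_n) \leq P(A_n)$, and summing over $n$ in (1.7.1) yields the bound. The result is known as Boole's inequality, and it is the standard tool for bounding the probability of a union when the events may overlap.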
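The inequality and the "disjoint versions" $B_n$ can be checked on a small finite case. The sketch below assumes a fair die and three overlapping events chosen for illustration. The helper `disjointify` is a hypothetical name for the construction that removes from each $A_n$ everything already covered by earlier events.

```python
from fractions import Fraction

def P(E) -> Fraction:
    """Uniform probability on a fair six-sided die (assumed setting)."""
    return Fraction(len(E), 6)

# Three overlapping events, chosen only for illustration.
A = [{1, 2, 3}, {2, 3, 4}, {4, 5}]

def disjointify(events):
    """Return B_n = A_n minus (A_1 ∪ ... ∪ A_{n-1}) for each n."""
    seen, pieces = set(), []
    for E in events:
        pieces.append(set(E) - seen)
        seen |= set(E)
    return pieces

B = disjointify(A)
union = set().union(*A)

p_union = P(union)               # P(A_1 ∪ A_2 ∪ A_3)
sum_B = sum(P(E) for E in B)     # equals p_union, since the B_n are disjoint
sum_A = sum(P(E) for E in A)     # upper bound from Boole's inequality

print(B, p_union, sum_B, sum_A)
```

The sum over the $A_n$ overcounts every outcome that lies in more than one event, which is exactly where the slack in the inequality comes from.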

🎯 Historical Context

By Joe Doob's account, he and William Feller settled on "random variable" over "chance variable" by a coin flip while preparing their probability textbooks, which appeared in the early 1950s. The flip decided only the adjective: the object is technically a function, and the word "variable" was common to both candidate names.

QUESTION 1

Let $S = \{1, 2, 3, \dots\}$, and let $X(s) = s^2$ and $Y(s) = 1/s$ for $s \in S$. Are $X$ and $Y$ random variables?

Yes, because they are functions mapping each $s \in S$ to a real number.

No, because $S$ is an infinite set.

Only $X$ exists; $Y$ is a fraction and thus not a variable.

Neither exists because their ranges are not Borel sets.

QUESTION 2

Consider rolling one fair six-sided die ($S = \{1,2,3,4,5,6\}$). Let $X(s) = s$. Let $Y = X^2$. What is $P(Y \le 10)$?

$1/2$

$1/6$

$1/3$

$10/36$

QUESTION 3

Which mathematicians used a coin flip to decide between the names 'Random Variable' and 'Chance Variable'?

Joe Doob and William Feller (Random Variable won)

Kolmogorov (Stochastic Variable won)

Thomas Bayes (Prior Variable won)

Pierre-Simon Laplace (Chance Variable won)

QUESTION 4

What is the primary role of the Indicator Function $I_A$?

To act as a bridge between set theory (events) and arithmetic.

To prove that a set is Borel.

To calculate the derivative of a distribution.

To determine if a sample space is discrete.

QUESTION 5

According to Theorem 1.6.1 (Continuity of Probability), if we have a sequence of events where $A_n \nearrow A$, what can we say about the limit?

$\lim_{n \to \infty} P(A_n) = P(A)$

$\lim_{n \to \infty} P(A_n) = 1$

$P(A)$ must be $0$

The limit does not exist for unbounded variables.